Undergraduate Dissertation

Abstract

Argumentation Mining (AM) enables us to reason with natural language, and effective AM depends on our ability to accurately detect claims in text. In structured texts, the claim detection performance of the state-of-the-art model, BERT, is well understood. Much less is understood about how BERT performs on less structured text, such as social media, which is more representative of “real world natural language”. We compare BERT’s performance on classical structured texts with its performance on semi-structured texts. We then study the performance improvements obtained by pre-training BERT on text from the same domain before training it for claim detection. Overall, we find that BERT performs well on semi-structured text, and that in-domain pre-training is not necessary to obtain good performance. This work can be continued by using classical argumentation mechanisms to relate claims to one another, enabling effective argumentation mining from social media.
